Factor oracle, Suffix oracle (Extended Abstract)
Abstract
We introduce a new automaton on a word p, a sequence of letters taken from an alphabet Σ, that we call the factor oracle. This automaton is acyclic, recognizes at least the factors of p, and has m + 1 states, where m is the length of p, and a linear number of transitions. We give an on-line construction algorithm for the factor oracle. The tight links between this structure and the suffix automaton allow us to introduce a second structure, the suffix oracle. We use these two structures in string matching algorithms that we conjecture to be optimal according to the experimental results. These algorithms are as efficient as the existing ones, while using less memory and being easier to implement.
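The on-line construction mentioned in the abstract can be sketched as follows. This is a minimal illustration based on the commonly described supply-link formulation, not the authors' reference code; the names `build_factor_oracle`, `accepts`, `trans`, and `supply` are our own choices for this example.

```python
def build_factor_oracle(p):
    """Build a factor oracle for the word p, one letter at a time.

    Returns (trans, supply):
      trans[s]  maps a letter to the target state of the transition from s
      supply[s] is the supply link of state s (-1 for the initial state)
    States are numbered 0..len(p); every state is accepting.
    """
    m = len(p)
    trans = [dict() for _ in range(m + 1)]
    supply = [-1] * (m + 1)
    for i, c in enumerate(p):
        # Internal transition i -> i+1 labelled with the new letter.
        trans[i][c] = i + 1
        # Walk the supply links, adding external transitions to the new
        # state until a state that already has a transition on c is found.
        k = supply[i]
        while k > -1 and c not in trans[k]:
            trans[k][c] = i + 1
            k = supply[k]
        supply[i + 1] = 0 if k == -1 else trans[k][c]
    return trans, supply


def accepts(trans, w):
    """Return True if w labels a path starting at the initial state."""
    state = 0
    for c in w:
        if c not in trans[state]:
            return False
        state = trans[state][c]
    return True


if __name__ == "__main__":
    p = "abbbaab"
    trans, supply = build_factor_oracle(p)
    print(len(trans), "states")
    print(sum(len(t) for t in trans), "transitions")
    print(accepts(trans, "bbaa"))  # "bbaa" is a factor of p
```

Each letter adds exactly one new state, which is why the automaton ends up with m + 1 states. Since transitions always lead to a higher-numbered state, the automaton is acyclic by construction.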
Similar resources
Statistical Properties of Factor Oracles
Factor and suffix oracles were introduced in [1] to provide an economical and efficient way of storing all the factors and suffixes, respectively, of a given text. Whereas good estimates exist for the worst-case size of the factor/suffix oracle, no average-case analysis has been done until now. In this paper, we give an estimation of the average size for the factor/suff...
A detail analysis on factor oracle construction of computing repeated factors
We present a detailed implementation of a linear time and space method, introduced in [3], to compute the length of a repeated suffix for each prefix of a given word p. This method is based on the factor oracle [1] of p, which is a deterministic acyclic automaton accepting all substrings of p. Keywords: factor oracle, suffix link, repetition
Combinatorial Characterization of the Language Recognized by Factor and Suffix Oracles
Sequence analysis requires data structures that allow both efficient storage and efficient use. Among these, we can cite tries [1], suffix automata [1, 2], and suffix trees [1, 3]. Cyril Allauzen, Maxime Crochemore and Mathieu Raffinot introduced [4, 5, 6] a new data structure, linear in the size of the represented word both in time and space, having the smallest number of states, and allowi...
Error analysis of factor oracles
Factor oracles [1] constructed from a given text are deterministic acyclic automata accepting all substrings of the text. Factor oracles are more space-economical and easier to implement than similar data structures such as suffix trees [6]. There is, however, a drawback: a factor oracle may accept strings that do not occur in the text, which we call an error acceptance. In this paper, we characterize factor or...
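To make the notion of error acceptance concrete, the sketch below builds an oracle for the small word "abbbaab" and lists every accepted word that is not a substring of it. This is an illustrative check only, using the usual supply-link construction; `oracle_transitions`, `accepted_words`, and `false_acceptances` are hypothetical helper names introduced for this example.

```python
def oracle_transitions(p):
    """Transition table of a factor oracle for p (supply-link construction)."""
    trans = [dict() for _ in range(len(p) + 1)]
    supply = [-1] * (len(p) + 1)
    for i, c in enumerate(p):
        trans[i][c] = i + 1
        k = supply[i]
        while k > -1 and c not in trans[k]:
            trans[k][c] = i + 1
            k = supply[k]
        supply[i + 1] = 0 if k == -1 else trans[k][c]
    return trans


def accepted_words(trans):
    """Enumerate all words spelled by paths from state 0.

    The automaton is acyclic, so this set is finite.
    """
    words = set()
    stack = [(0, "")]
    while stack:
        state, word = stack.pop()
        words.add(word)
        for letter, target in trans[state].items():
            stack.append((target, word + letter))
    return words


def false_acceptances(p):
    """Words accepted by the oracle of p that are not substrings of p."""
    substrings = {p[i:j] for i in range(len(p) + 1)
                  for j in range(i, len(p) + 1)}
    return accepted_words(oracle_transitions(p)) - substrings


if __name__ == "__main__":
    # "abba" is accepted by the oracle although it never occurs in the text.
    print(sorted(false_acceptances("abbbaab")))
```

Exhaustive enumeration is only practical for short words; it is used here purely to exhibit the phenomenon, not as an analysis method.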
A new taxonomy of sublinear keyword pattern matching algorithms
This paper presents a new taxonomy of sublinear (multiple) keyword pattern matching algorithms. Based on an earlier taxonomy by Watson and Zwaan [WZ96, WZ95], this new taxonomy includes not only suffix-based algorithms related to the Boyer-Moore, Commentz-Walter and Fan-Su algorithms, but factor- and factor oracle-based algorithms such as Backward DAWG Matching and Backward Oracle Matching as well...